171 research outputs found

Computational Nosology and Precision Psychiatry

Author: Friston KJ
Gordon JA
Redish AD
Publication date: 01/10/2017

This article provides an illustrative treatment of psychiatric morbidity that offers an alternative to the standard nosological model in psychiatry. It considers what would happen if we treated diagnostic categories not as causes of signs and symptoms, but as diagnostic consequences of psychopathology and pathophysiology. This reformulation (of the standard nosological model) opens the door to a more natural description of how patients present, and of their likely responses to therapeutic interventions. In brief, we describe a model that generates symptoms, signs, and diagnostic outcomes from latent psychopathological states. In turn, psychopathology is caused by pathophysiological processes that are perturbed by (etiological) causes such as predisposing factors, life events, and therapeutic interventions. The key advantages of this nosological formulation include (i) the formal integration of diagnostic (e.g., DSM) categories and latent psychopathological constructs (e.g., the dimensions of the Research Domain Criteria); (ii) the provision of a hypothesis or model space that accommodates formal, evidence-based hypothesis testing (using Bayesian model comparison); and (iii) the ability to predict therapeutic responses (using a posterior predictive density), as in precision medicine. These and other advantages are largely promissory at present: the purpose of this article is to show what might be possible, through the use of idealized simulations.

Directory of Open Access Journals

UCL Discovery

Bandit Models of Human Behavior: Reward Processing in Mental Disorders

Author: A Dezfouli
AD Redish
AM Taylor
D Bouneffouf
DC Perry
LE Hess
M Luman
MJ Frank
P Auer
TL Lai
TU Hauser
W Thompson
WR Thompson
WW Seeley
Publication date: 07/06/2017

Drawing inspiration from behavioral studies of human decision making, we propose a general parametric framework for the multi-armed bandit problem, which extends the standard Thompson Sampling approach to incorporate reward processing biases associated with several neurological and psychiatric conditions, including Parkinson's and Alzheimer's diseases, attention-deficit/hyperactivity disorder (ADHD), addiction, and chronic pain. We demonstrate empirically that the proposed parametric approach can often outperform the baseline Thompson Sampling on a variety of datasets. Moreover, from the behavioral modeling perspective, our parametric framework can be viewed as a first step towards a unifying computational model capturing reward processing abnormalities across multiple mental conditions.
Comment: Conference on Artificial General Intelligence, AGI-1
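
The baseline this abstract builds on, Thompson Sampling with Bernoulli rewards, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `thompson_sampling` and the `reward_weight` and `punish_weight` parameters are hypothetical, standing in for the general idea of weighting positive and negative feedback unequally. With both weights at 1.0 the code is standard Beta-Bernoulli Thompson Sampling.

```python
import random

def thompson_sampling(true_probs, n_rounds, reward_weight=1.0,
                      punish_weight=1.0, seed=0):
    """Beta-Bernoulli Thompson Sampling; returns pull counts per arm.

    reward_weight and punish_weight are hypothetical knobs (not taken from
    the paper) that scale how strongly successes and failures update the
    posterior. Both equal to 1.0 gives the standard algorithm.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [1.0] * n_arms  # Beta prior alpha = 1
    failures = [1.0] * n_arms   # Beta prior beta = 1
    pulls = [0] * n_arms
    for _ in range(n_rounds):
        # Sample a plausible mean reward for each arm from its posterior.
        samples = [rng.betavariate(successes[a], failures[a])
                   for a in range(n_arms)]
        arm = max(range(n_arms), key=lambda a: samples[a])
        pulls[arm] += 1
        # Observe a Bernoulli reward and update the chosen arm's posterior.
        if rng.random() < true_probs[arm]:
            successes[arm] += reward_weight
        else:
            failures[arm] += punish_weight
    return pulls

# Two hypothetical arms with success probabilities 0.2 and 0.8.
pulls = thompson_sampling([0.2, 0.8], n_rounds=2000)
print(pulls)
```

Changing the two weights away from 1.0 biases how quickly the agent commits to or abandons an arm, which is one simple way to explore the kind of reward-processing asymmetry the abstract describes.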

arXiv.org e-Print Archive

Crossref

Altered Risk-Based Decision Making following Adolescent Alcohol Use Results from an Imbalance in Reinforcement Learning in Rats

Author: A Bechara
A Rangel
A Sclafani
AD Redish
AE Goudriaan
Andrew S. Hart
Anne L. Collins
CA Johnson
DJ Nutt
DL McKinzie
E Yechiam
ED Witt
F Crews
GS Corrado
Ilene L. Bernstein
J Peris
JC Stout
Jeremy J. Clark
JG March
LP Spear
MJ Frank
MJ Wanat
NA Nasrallah
NE Rowland
Nicholas A. Nasrallah
O Mihatsch
P Piray
Paul E. M. Phillips
PH Chiu
PN Tobler
PW Glimcher
RA Chambers
RA Rescorla
RM Philpot
S Kakade
S Reilly
TE Behrens
W Hodos
W Schultz
Wael El-Deredy
Y Niv
Publication venue: Public Library of Science
Publication date: 01/01/2012

Alcohol use during adolescence has profound and enduring consequences for decision-making under risk. However, the fundamental psychological processes underlying these changes are unknown. Here, we show that alcohol use produces over-fast learning for better-than-expected, but not worse-than-expected, outcomes without altering subjective reward valuation. We constructed a simple reinforcement learning model to simulate altered decision making using behavioral parameters extracted from rats with a history of adolescent alcohol use. Remarkably, the learning imbalance alone was sufficient to simulate the divergence in choice behavior observed between these groups of animals. These findings identify a selective alteration in reinforcement learning following adolescent alcohol use that can account for a robust change in risk-based decision making persisting into later life.
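
The learning imbalance described in this abstract can be illustrated with a value update that uses separate learning rates for better-than-expected and worse-than-expected outcomes. The sketch below is an assumption-laden toy, not the authors' model: the function name `learned_value`, the learning-rate values, and the alternating payoff schedule are all hypothetical.

```python
def learned_value(outcomes, alpha_pos, alpha_neg, v0=0.0):
    """Track the value of one option with separate learning rates.

    alpha_pos applies when the outcome is better than expected
    (positive prediction error), alpha_neg when it is worse.
    """
    v = v0
    for r in outcomes:
        delta = r - v  # reward prediction error
        alpha = alpha_pos if delta > 0 else alpha_neg
        v += alpha * delta
    return v

# A risky option that pays 1 or 0 on alternating trials (mean payoff 0.5).
outcomes = [1.0, 0.0] * 100
balanced = learned_value(outcomes, alpha_pos=0.1, alpha_neg=0.1)
imbalanced = learned_value(outcomes, alpha_pos=0.3, alpha_neg=0.1)
print(f"balanced: {balanced:.3f}, imbalanced: {imbalanced:.3f}")
```

With a larger learning rate for positive prediction errors, the learned value of the risky option settles above the value learned with balanced rates, even though the payoffs are identical. That is the sense in which a learning imbalance alone could shift choice toward risk.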

CiteSeerX

Crossref

Directory of Open Access Journals

PubMed Central

University of East Anglia digital repository

FigShare

Imbalanced decision hierarchy in addicts emerging from drug-hijacked dopamine spiraling circuit

Author: A Dezfouli
AD Redish
Allan V. Kalueff
AW Stacy
BJ Everitt
Boris Gutkin
D Badre
D Belin
E Koechlin
G Di Chiara
GE Alexander
I Willuhn
LJMJ Vanderschuren
LV Panlilio
M Haruno
M Matsumoto
Mehdi Keramati
MJ Frank
MM Botvinick
ND Daw
ND Volkow
P Dayan
P Piray
PW Kalivas
RJ Lamb
RZ Goldstein
SN Haber
V Deroche-Gamonet
W Schultz
Y Takahashi
Publication venue: Public Library of Science (PLoS)
Publication date: 24/04/2013

Despite explicitly wanting to quit, long-term addicts find themselves powerless to resist drugs, even though they know that drug-taking may be a harmful course of action. Such inconsistency between the explicit knowledge of negative consequences and the compulsive behavioral patterns represents a cognitive/behavioral conflict that is a central characteristic of addiction. Neurobiologically, differential cue-induced activity in distinct striatal subregions, as well as the dopamine connectivity spiraling from ventral striatal regions to the dorsal regions, play critical roles in compulsive drug seeking. However, the functional mechanism that integrates these neuropharmacological observations with the above-mentioned cognitive/behavioral conflict is unknown. Here we provide a formal computational explanation for the drug-induced cognitive inconsistency that is apparent in the addicts' “self-described mistake”. We show that addictive drugs gradually produce a motivational bias toward drug-seeking at low-level habitual decision processes, despite the low abstract cognitive valuation of this behavior. This pathology emerges within the hierarchical reinforcement learning framework when chronic exposure to the drug pharmacologically produces pathologically persistent phasic dopamine signals. Thereby the drug hijacks the dopaminergic spirals that cascade the reinforcement signals down the ventro-dorsal cortico-striatal hierarchy. Neurobiologically, our theory accounts for rapid development of drug cue-elicited dopamine efflux in the ventral striatum and a delayed response in the dorsal striatum. Our theory also shows how this response pattern depends critically on the dopamine spiraling circuitry. Behaviorally, our framework explains gradual insensitivity of drug-seeking to drug-associated punishments, the blocking phenomenon for drug outcomes, and the persistent preference for drugs over natural rewards by addicts.
The model suggests testable predictions and, beyond that, sets the stage for a view of addiction as a pathology of hierarchical decision-making processes. This view is complementary to the traditional interpretation of addiction as an interaction between habitual and goal-directed decision systems.

CiteSeerX

Public Library of Science (PLOS)

City Research Online

Crossref

Directory of Open Access Journals

UCL Discovery

PubMed Central

FigShare

Evaluation of the Oscillatory Interference Model of Grid Cell Firing through Analysis and Measured Period Variance of Some Biological Oscillators

Author: A Guanella
A Jeewajee
A Manwani
A Reboreda
AD Redish
B Tahvildari
Babak Tahvildari
BG Burton
BL McNaughton
Boris S. Gutkin
C Barry
CN Boccara
D Chow
D Ham
E Fransén
Eric A. Zilli
F Sargolini
GJ Quirk
H Eichenbaum
HT Blair
J O'Keefe
JA White
JJ Knierim
JP Goodridge
JS Taube
JT Davie
Lisa M. Giocomo
LM Giocomo
M Fyhn
M Lengyel
M Yoshida
MC Fuhs
ME Hasselmo
Michael E. Hasselmo
Motoharu Yoshida
N Burgess
NM van Strien
OS Vinogradova
P Gaussier
PE Welinder
R Klink
RF Galán
RF Langston
S Marella
SJY Mizumori
T Hafting
T Verechtchaguina
TE Breen
Y Burak
Publication venue: Public Library of Science
Publication date: 01/11/2009

Models of the hexagonally arrayed spatial activity pattern of grid cell firing in the literature generally fall into two main categories: continuous attractor models or oscillatory interference models. Burak and Fiete (2009, PLoS Comput Biol) recently examined noise in two continuous attractor models, but did not consider oscillatory interference models in detail. Here we analyze an oscillatory interference model to examine the effects of noise on its stability and spatial firing properties. We show analytically that the square of the drift in encoded position due to noise is proportional to time and inversely proportional to the number of oscillators. We also show there is a relatively fixed breakdown point, independent of many parameters of the model, past which noise overwhelms the spatial signal. Based on this result, we show that a pair of oscillators are expected to maintain a stable grid for approximately $t = 5\mu^3/(4\pi\sigma)^2$ seconds, where $\mu$ is the mean period of an oscillator in seconds and $\sigma^2$ its variance in seconds$^2$. We apply this criterion to recordings of individual persistent spiking neurons in postsubiculum (dorsal presubiculum) and layers III and V of entorhinal cortex, to subthreshold membrane potential oscillation recordings in layer II stellate cells of medial entorhinal cortex and to values from the literature regarding medial septum theta bursting cells. All oscillators examined have expected stability times far below those seen in experimental recordings of grid cells, suggesting the examined biological oscillators are unfit as a substrate for current implementations of oscillatory interference models. However, oscillatory interference models can tolerate small amounts of noise, suggesting the utility of circuit level effects which might reduce oscillator variability. Further implications for grid cell models are discussed.
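
As a worked example of the stability criterion quoted in this abstract, the expected stable-grid time is five times the cube of the mean period, divided by the square of 4π times the period's standard deviation. The sketch below simply evaluates that expression. The function name `stability_time` and the oscillator values (8 Hz mean frequency, 10 ms period standard deviation) are hypothetical and are not taken from the paper's recordings.

```python
import math

def stability_time(mean_period, period_sd):
    """Expected stable-grid duration t = 5*mu^3 / (4*pi*sigma)^2, in seconds.

    mean_period (mu) and period_sd (sigma) are both in seconds.
    """
    return 5.0 * mean_period ** 3 / (4.0 * math.pi * period_sd) ** 2

# Hypothetical oscillator: 8 Hz mean frequency, 10 ms period SD.
mu, sigma = 0.125, 0.010
print(f"{stability_time(mu, sigma):.2f} s")
```

For these illustrative values the expected stability time is roughly 0.6 seconds. Because the standard deviation enters squared, halving it quadruples the stable time.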

CiteSeerX

Public Library of Science (PLOS)

Crossref

Boston University Institutional Repository (OpenBU)

Directory of Open Access Journals

PubMed Central

Temporal-Difference Reinforcement Learning with Distributed Representations

Author: A Johnson
A Kacelnik
A. David Redish
AD Redish
AG Barto
AG Sanfey
AL Odum
AM Graybiel
AV Beylin
B Reynolds
CD Fiorillo
CR Gallistel
D Read
D Self
DC Rubin
DI Laibson
DW Stephens
E Pastalkova
EA Ludvig
F Wörgötter
G Ainslie
G Thibaudeau
GD Stuber
GE Alexander
GJ Madden
HM Bayer
I Pavlov
J Gibbon
J Mazur
J Mirenowicz
JC Jackson
JE Mazur
JER Staddon
JF Cheer
JJ Day
JP O'Doherty
JR Hollerman
JR Norris
K Doya
K Samejima
M Bertin
M Kawato
MF Roitman
N Schweighofer
ND Daw
NJ Mackintosh
NM Petry
Olaf Sporns
P Brémaud
P Dayan
PD Sozou
PEM Phillips
PL Strick
PR Montague
PR Solomon
PS Kaplan
R Bellman
RA Rescorla
RE Suri
RE Vuchinich
RJ Herrnstein
RM Wightman
RN Cardinal
RS Sutton
RS Zemel
S Kakade
SC Tanaka
SH Mitchell
SJ Badtke
SM Alessi
SM McClure
SN Haber
T Das
T Kalenscher
T Ljungberg
TJ Shors
W Schultz
WB Levy
WX Pan
Y Niv
Zeb Kurth-Nelson
Publication venue: Public Library of Science
Publication date: 01/01/2009

Temporal-difference (TD) algorithms have been proposed as models of reinforcement learning (RL). We examine two issues of distributed representation in these TD algorithms: distributed representations of belief and distributed discounting factors. Distributed representation of belief allows the believed state of the world to distribute across sets of equivalent states. Distributed exponential discounting factors produce hyperbolic discounting in the behavior of the agent itself. We examine these issues in the context of a TD RL model in which state-belief is distributed over a set of exponentially-discounting “micro-Agents” (µAgents), each of which has a separate discounting factor (γ). Each µAgent maintains an independent hypothesis about the state of the world, and a separate value-estimate of taking actions within that hypothesized state. The overall agent thus instantiates a flexible representation of an evolving world-state. As with other TD models, the value-error (δ) signal within the model matches dopamine signals recorded from animals in standard conditioning reward-paradigms. The distributed representation of belief provides an explanation for the decrease in dopamine at the conditioned stimulus seen in overtrained animals, for the differences between trace and delay conditioning, and for transient bursts of dopamine seen at movement initiation. Because each µAgent also includes its own exponential discounting factor, the overall agent shows hyperbolic discounting, consistent with behavioral experiments.
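
One way to see how a population of exponential discounters can look hyperbolic in aggregate is the identity $\int_0^1 \gamma^t \, d\gamma = 1/(1+t)$: averaging $\gamma^t$ over discount factors spread uniformly on (0, 1) gives a hyperbolic function of the delay $t$. The sketch below checks this numerically. It is a minimal illustration under stated assumptions: the uniform spread of discount factors and the function name `mean_discount` are choices made here, and the code omits the distributed state belief that the abstract's full model also includes.

```python
def mean_discount(delay, n_agents=1000):
    """Average the exponential discounts gamma**delay over many agents.

    Discount factors are spread evenly over (0, 1); this uniform spread is
    an assumption of the sketch.
    """
    gammas = [(i + 0.5) / n_agents for i in range(n_agents)]
    return sum(g ** delay for g in gammas) / n_agents

# Compare the population average with the hyperbolic curve 1 / (1 + delay).
for delay in (0, 1, 2, 5, 10):
    hyperbolic = 1.0 / (1.0 + delay)
    print(delay, round(mean_discount(delay), 4), round(hyperbolic, 4))
```

Each individual agent still discounts exponentially; only the average across agents follows the hyperbolic curve.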

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

UCL Discovery

Grid Cells, Place Cells, and Geodesic Generalization for Spatial Reinforcement Learning

Author: A Alvernhe
A Gorchetchnikov
A Johnson
A Samsonovich
AA Fenton
AD Redish
BL McNaughton
C Barry
C Molter
CB Canto
D Derdikman
DJ Foster
DM Finch
E Tolman
EA Ludvig
EA Zilli
EI Moser
ET Rolls
F Sargolini
G Dragoi
G Konidaris
H Mhatre
HT Blair
IR Fiete
J Houk
J Jeanblanc
J O'Keefe
JB Tenenbaum
JW Sammon
K Doya
KB Kjelstrup
KI Blum
KJ Jeffery
Konrad P. Kording
LH Corbit
M Franzius
M Fyhn
MA Brown
MC Fuhs
ME Hasselmo
MR Mehta
MW Jung
N Burgess
N Daw
Nathaniel D. Daw
ND Daw
Nicholas J. Gustafson
P Dayan
PE Sharp
PF Krayniak
PJ Best
R Floyd
R Sutton
RE Suri
RF Langston
RS Sutton
RU Muller
S Mahadevan
S McClure
S Totterdell
T Hafting
T Solstad
TJ Wills
TS Collett
VH Brun
W Gerstner
W Schultz
WE Skaggs
Publication venue: Public Library of Science
Publication date: 01/10/2011

Reinforcement learning (RL) provides an influential characterization of the brain's mechanisms for learning to make advantageous choices. An important problem, though, is how complex tasks can be represented in a way that enables efficient learning. We consider this problem through the lens of spatial navigation, examining how two of the brain's location representations—hippocampal place cells and entorhinal grid cells—are adapted to serve as basis functions for approximating value over space for RL. Although much previous work has focused on these systems' roles in combining upstream sensory cues to track location, revisiting these representations with a focus on how they support this downstream decision function offers complementary insights into their characteristics. Rather than localization, the key problem in learning is generalization between past and present situations, which may not match perfectly. Accordingly, although neural populations collectively offer a precise representation of position, our simulations of navigational tasks verify the suggestion that RL gains efficiency from the more diffuse tuning of individual neurons, which allows learning about rewards to generalize over longer distances given fewer training experiences. However, work on generalization in RL suggests the underlying representation should respect the environment's layout. In particular, although it is often assumed that neurons track location in Euclidean coordinates (that a place cell's activity declines “as the crow flies” away from its peak), the relevant metric for value is geodesic: the distance along a path, around any obstacles. We formalize this intuition and present simulations showing how Euclidean, but not geodesic, representations can interfere with RL by generalizing inappropriately across barriers. 
Our proposal that place and grid responses should be modulated by geodesic distances suggests novel predictions about how obstacles should affect spatial firing fields, which provides a new viewpoint on data concerning both spatial codes.
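
The contrast between Euclidean and geodesic distance drawn in this abstract can be made concrete on a small grid with a barrier. The sketch below is illustrative only: the arena layout, the 4-connected movement rule, and the function name `geodesic_distance` are assumptions made here, and breadth-first search is used as a simple stand-in for whatever path metric the authors' simulations employ.

```python
import math
from collections import deque

def geodesic_distance(grid, start, goal):
    """Shortest 4-connected path length from start to goal, avoiding '#'.

    Returns None if the goal cannot be reached.
    """
    rows, cols = len(grid), len(grid[0])
    dist = {start: 0}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        if (r, c) == goal:
            return dist[(r, c)]
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] != "#" and (nr, nc) not in dist):
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return None

# Hypothetical arena: a wall ('#') with a gap only in the bottom row.
arena = [
    "..#..",
    "..#..",
    "..#..",
    "..#..",
    ".....",
]
start, goal = (0, 1), (0, 3)
euclidean = math.dist(start, goal)
geodesic = geodesic_distance(arena, start, goal)
print(euclidean, geodesic)
```

The two cells sit two units apart as the crow flies but ten steps apart along the only open path. A value function that generalizes by Euclidean distance would therefore treat locations on opposite sides of the wall as near neighbors.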

CiteSeerX

Crossref

Directory of Open Access Journals

PubMed Central

Fine-Tuning and the Stability of Recurrent Neural Networks

Author: A Pouget
A Renart
AA Koulakov
AD Redish
AP Georgopoulos
BD Mensh
BM Weissman
C Eliasmith
CD Brody
CF Stevens
Chris Eliasmith
D Durstewitz
D Hebb
D Kömpf
D Robinson
David MacNeil
DB Arnold
DJ Amit
E Aksay
E Fransén
E Henneman
E Vasilaki
Eleni Vasilaki
G Major
H Collewijn
HS Seung
J Conklin
J Hardie
J Jacobs
J Park
J Porrill
JJ Hopfield
JP Goodridge
K Hess
K Zhang
L Dell'Osso
LR Harris
M Nikitchenko
MS Goldman
P Miller
PR Montague
R Lorente De Nó
R McCrea
R Singh
RB Weber
RF Lewis
RJ Leigh
RJ Williams
RP Rao
S Deneve
SC Turaga
T Bekolay
T Ikezu
Y Lass
Z Kapoula
Z Wang
Publication venue: Public Library of Science
Publication date: 27/09/2011

A central criticism of standard theoretical approaches to constructing stable, recurrent model networks is that the synaptic connection weights need to be finely tuned. This criticism is severe because proposed rules for learning these weights have been shown to have various limitations to their biological plausibility. Hence it is unlikely that such rules are used to continuously fine-tune the network in vivo. We describe a learning rule that is able to tune synaptic weights in a biologically plausible manner. We demonstrate and test this rule in the context of the oculomotor integrator, showing that only known neural signals are needed to tune the weights. We demonstrate that the rule appropriately accounts for a wide variety of experimental results, and is robust under several kinds of perturbation. Furthermore, we show that the rule is able to achieve stability as good as or better than that provided by the linearly optimal weights often used in recurrent models of the integrator. Finally, we discuss how this rule can be generalized to tune a wide variety of recurrent attractor networks, such as those found in head direction and path integration systems, suggesting that it may be used to tune a wide variety of stable neural systems.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Long-term coding of personal and universal associations underlying the memory web in the human brain

Author: AD Redish
CA Erickson
F Vargha-Khadem
G Kreiman
GV Wallenstein
H Eichenbaum
I Fried
IV Viskontas
K Sakai
K Tanaka
L Nadel
L Reddy
L Squire
LR Squire
M Bunsey
M Day
M Ison
M Moscovitch
M Rubinov
M Yanike
PJ Bayley
R Quian Quiroga
RE Hampson
RS Rosenbaum
S Hattori
S Higuchi
S McKenzie
S Steinvorth
S Wirth
VD Blondel
Y Naya
Y Rubner
Y Ziv
Publication venue: Springer Science and Business Media LLC
Publication date: 15/11/2016

Neurons in the medial temporal lobe (MTL), a critical area for declarative memory, have been shown to change their tuning in associative learning tasks. Yet, it is unclear how durable these neuronal representations are and whether they outlast the execution of the task. To address this issue, we studied the responses of MTL neurons in neurosurgical patients to known concepts (people and places). Using association scores provided by the patients and a web-based metric, here we show that whenever MTL neurons respond to more than one concept, these concepts are typically related. Furthermore, the degree of association between concepts could be successfully predicted based on the neurons’ response patterns. These results provide evidence for a long-term involvement of MTL neurons in the representation of durable associations, a hallmark of human declarative memory.

Nottingham ePrints

Nottingham eTheses

Crossref

Repository@Nottingham

PubMed Central

Leicester Research Archive

Speed/Accuracy Trade-Off between the Habitual and the Goal-Directed Processes

Author: A Dickinson
A Mas-Colell
A Rangel
A Shah
A Yuille
AD Redish
AG Barto
Amir Dezfouli
AT Welford
B Balleine
B Shiv
BW Balleine
C Vickrey
CD Adams
D Belin
D Hu
D Joel
DE Broadbent
DM Jackson
E Alluisi
E Tolman
G Gigerenzer
GD Carr
GH Mowbray
H Simon
H Tassinari
HH Yin
IM Spigel
JD Salamone
JD Sokolowski
JE Aberman
JI Gold
JL Evenden
JN Tsitsiklis
JR Taylor
K Muenzinger
M Correa
M Geist
M Haruno
M Jueptner
M Lyons
M Pessiglione
Mehdi Keramati
MF Brown
ML Evans
ND Daw
NL Munn
Payam Piray
PC Holland
PR Montague
R Dearden
R Howard
R Hyman
RE Suri
RK Mahurin
RL Buckner
RM Colwill
RS Sutton
S Killcross
S Mingote
S Zilberstein
SA Ellias
SJ Julier
SM McClure
SN Haber
T Ljungberg
Tim Behrens
TW Robbins
W Schultz
WE Hick
Y Kosaki
Y Niv
Publication venue: Public Library of Science
Publication date: 01/01/2011

Instrumental responses are hypothesized to be of two kinds: habitual and goal-directed, mediated by the sensorimotor and the associative cortico-basal ganglia circuits, respectively. The existence of the two heterogeneous associative learning mechanisms can be hypothesized to arise from the comparative advantages that they have at different stages of learning. In this paper, we assume that the goal-directed system is behaviourally flexible, but slow in choice selection. The habitual system, in contrast, is fast in responding, but inflexible in adapting its behavioural strategy to new conditions. Based on these assumptions and using the computational theory of reinforcement learning, we propose a normative model for arbitration between the two processes that strikes an approximately optimal balance between search time and accuracy in decision making. Behaviourally, the model can explain experimental evidence on behavioural sensitivity to outcome at the early stages of learning, but insensitivity at the later stages. It also explains why, when two choices with equal incentive values are available concurrently, the behaviour remains outcome-sensitive, even after extensive training. Moreover, the model can explain choice reaction time variations during the course of learning, as well as the experimental observation that as the number of choices increases, the reaction time also increases. Neurobiologically, by assuming that phasic and tonic activities of midbrain dopamine neurons carry the reward prediction error and the average reward signals used by the model, respectively, the model predicts that whereas phasic dopamine indirectly affects behaviour through reinforcing stimulus-response associations, tonic dopamine can directly affect behaviour through manipulating the competition between the habitual and the goal-directed systems and thus affect reaction time.

CiteSeerX

Public Library of Science (PLOS)

City Research Online

Crossref

Directory of Open Access Journals

PubMed Central

UCL Discovery